CODA: Commonsense-Driven Autoregressive Human Interaction Generation

ICLR 2026 - Submission No: 1955

A. Performance of CODA (Figure 1)

Two people bow to each other.

The two people hug each other tightly.

InterMask

CODA (ours)


B. Comparison with the State of the Art (Figure 4)

The two people hug each other tightly.

One person sneaks up on the other from behind.

Two people are boxing. One is continuously punching while the other is defending and counterattacking.

InterMask

CODA (ours)


C. Ablation Results (Figure 6)

The first person shakes hands with the second to say hello.

Two persons walk forward while hugging each other.

InterMask

VQ-VAE

W/ LCOM
W/ LCOM & LKTraj

CVQ-VAE

W/ LDM
W/ LGDM

W/ LCOM & LKTraj & LGDM

CODA (ours)


D. Additional Results (Figure 9)


One person approaches the other.

Two people are waving their hands and performing a dance step together.

The first person is sitting in a chair; the second takes a step forward with their right foot.
The two are blaming each other and having an intense argument.

Two persons walk forward while hugging each other.

Both people are doing fencing practice, attacking each other with their swords. During the practice, the first person makes a short lunge and touches the tip of the sword to the top of the second's head.